Statistical Matching of Household Survey Files
Abstract
Suppose that two files are available that share some overlapping variables (U) and that each contain variables unique to that file (S and T). Each of S, T, and U can be multivariate. Occasionally, we can match the same individuals (firms, establishments, or households) across the two files to create a larger file that contains fuller information. This procedure is called exact matching, and it is quite popular for establishments and firms because large-scale surveys of firms and establishments are available. Many household surveys, on the other hand, have relatively small samples. In such cases a statistical matching technique can be used, in which a household in one file is matched to a household in the second file if the two are judged to be similar in some sense. Statistical matching has been used for some time, at least in the U.S. and other countries, to combine different household survey files, but there is little experience with it for Japanese household surveys. According to Rodgers [4], who surveyed examples of statistical matching, the technique was used as a substitute for exact matching without a rigorous theoretical background until recently, when some attempts were made to remedy this weakness. The U.S. Department of Commerce [9] also notes that one reason for using statistical matching is the legal restriction that privacy legislation places on exact matching. In such a case, one has to rely on statistical matching even when exact matching would otherwise be possible. The purpose of this paper is to examine whether statistical matching is effective in extracting fuller information from existing files. Two Japanese statistical surveys are suitable for this purpose: the “Family Income and Expenditure Survey” and the “Family Savings Survey,” which we call FIES and FSS for short. FIES is conducted monthly and covers approximately 8000 households, while FSS is an annual survey covering a portion of the FIES households.
Thus, in this experiment, S represents monthly income and consumption, U represents annual income and other household attributes such as the number of family members and their ages, and T represents assets and liabilities. We applied various matching techniques to these two household surveys, which actually share a portion of their households. Because we can exactly match the common part of the files and thereby observe the true joint structure of all the variables (S, T, U), we can examine the effectiveness of various statistical matching methods. In our earlier studies [10, 11], we compared the exactly matched file with statistically matched files produced by several different methods. In those experiments, we used the common part of the two files (the n3 = 1740 households shown in Figure 1) and compared basic multivariate statistics such as correlation coefficients.
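The evaluation idea described above can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes a simple nearest-neighbor rule on the standardized common variables U, and it runs on synthetic data in which the true (S, T, U) are known, standing in for the exactly matched households. The function names nearest_neighbor_match and demo, and all coefficients in the data-generating step, are hypothetical.

```python
import numpy as np


def nearest_neighbor_match(u_recipient, u_donor):
    """For each recipient row, return the index of the closest donor row on U.

    Illustrative rule only: Euclidean distance on pooled-standardized U.
    """
    pooled = np.vstack([u_recipient, u_donor])
    mean = pooled.mean(axis=0)
    std = pooled.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    a = (u_recipient - mean) / std
    b = (u_donor - mean) / std
    # Squared distances with shape (n_recipient, n_donor) via broadcasting.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def demo(seed=0, n=500):
    """Compare corr(S, T) in the true file and in a statistically matched file."""
    rng = np.random.default_rng(seed)
    # Synthetic population of 2n households (assumed, not survey data).
    u = rng.normal(size=(2 * n, 2))
    s = u @ np.array([1.0, 0.5]) + rng.normal(scale=0.5, size=2 * n)
    t = u @ np.array([0.8, -0.3]) + 0.5 * s + rng.normal(scale=0.5, size=2 * n)

    # File A observes (U, S) for the first n households,
    # file B observes (U, T) for the remaining n households.
    u_a, s_a, t_a_true = u[:n], s[:n], t[:n]
    u_b, t_b = u[n:], t[n:]

    # Statistical match: borrow T from the most similar household in file B.
    idx = nearest_neighbor_match(u_a, u_b)
    t_a_matched = t_b[idx]

    true_corr = np.corrcoef(s_a, t_a_true)[0, 1]
    matched_corr = np.corrcoef(s_a, t_a_matched)[0, 1]
    return true_corr, matched_corr


if __name__ == "__main__":
    true_corr, matched_corr = demo()
    print(f"corr(S, T) with true T:    {true_corr:.3f}")
    print(f"corr(S, T) with matched T: {matched_corr:.3f}")
```

Because the matched T is chosen using U alone, any dependence between S and T that does not run through U cannot be recovered by this rule, so a gap between the two printed correlations is one simple indicator of how much information the match loses.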